Blind Search for Atari-Like Online Planning Revisited

نویسندگان

  • Alexander Shleyfman
  • Alexander Tuisov
  • Carmel Domshlak
چکیده

Similarly to the classical AI planning, the Atari 2600 games supported in the Arcade Learning Environment all feature a fully observable (RAM) state and actions that have deterministic effect. At the same time, the problems in ALE are given only implicitly, via a simulator, a priori precluding exploiting most of the modern classical planning techniques. Despite that, Lipovetzky et al. [2015] recently showed how online planning for Atari-like problems can be effectively addressed using IW(i), a blind state-space search algorithm that employs a certain form of structural similarity-based pruning. We show that the effectiveness of the blind statespace search for Atari-like online planning can be pushed even further by focusing the search using both structural state similarity and the relative myopic value of the states. We also show that the planning effectiveness can be further improved by considering online planning for the Atari games as a multiarmed bandit style competition between the various actions available at the state planned for, and not purely as a classical planning style action sequence optimization problem.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Classical Planning with Simulators: Results on the Atari Video Games

The Atari 2600 games supported in the Arcade Learning Environment [Bellemare et al., 2013] all feature a known initial (RAM) state and actions that have deterministic effects. Classical planners, however, cannot be used off-the-shelf as there is no compact PDDL-model of the games, and action effects and goals are not known a priori. Indeed, there are no explicit goals, and the planner must sele...

متن کامل

Classical Planning Algorithms on the Atari Video Games

The Atari 2600 games supported in the Arcade Learning Environment (Bellemare et al. 2013) all feature a known initial (RAM) state and actions that have deterministic effects. Classical planners, however, cannot be used for selecting actions for two reasons: first, no compact PDDL-model of the games is given, and more importantly, the action effects and goals are not known a priori. Moreover, in...

متن کامل

Width-Based Planning for General Video-Game Playing

IW(1) is a simple search algorithm that assumes that states can be characterized in terms of a set of boolean features or atoms. IW(1) consists of a standard breadth-first search with one variation: a newly generated state is pruned if it does not make a new atom true. Thus, while a breadth-first search runs in time that is exponential in the number of atoms, IW(1) runs in linear time. Variatio...

متن کامل

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning

The combination of modern Reinforcement Learning and Deep Learning approaches holds the promise of making significant progress on challenging applications requiring both rich perception and policy-selection. The Arcade Learning Environment (ALE) provides a set of Atari games that represent a useful benchmark set of such applications. A recent breakthrough in combining model-free reinforcement l...

متن کامل

Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games

Monte Carlo Tree Search (MCTS) methods have proven powerful in planning for sequential decision-making problems such as Go and video games, but their performance can be poor when the planning depth and sampling trajectories are limited or when the rewards are sparse. We present an adaptation of PGRD (policy-gradient for rewarddesign) for learning a reward-bonus function to improve UCT (a MCTS a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016